# About

This document contains various data statistics relating to the text and image data of the <https://arxiv.org> dataset. The entire source of the arXiv dataset was downloaded at the end of 2018, then extracted into individual files and queried. Most of these statistics were collected by querying the dataset together with the associated metadata provided in OAI format (see <https://arxiv.org/help/oa>).


# Table of Contents

1.  [About](#org1103506)
2.  [Key](#orgcd24af6)
3.  [General](#orgb342f7e)
    1.  [Data timespan](#org7f7af0a)
4.  [Text](#org186be3d)
    1.  [Total number of articles](#orgdb6c377)
    2.  [Number of PDF-only articles](#org1bcde2c)
    3.  [Number of text files extracted from PDFs](#orgffb88c9)
    4.  [Number of articles that include source (generally in TeX or LaTeX format)](#org1c7a4ad)
    5.  [Number of source-only articles (single source file only)](#orgcf8ad55)
    6.  [Articles with at least one image](#orgca70172)
    7.  [Articles by month/year](#orgdb31bf5)
    8.  [Articles by year](#org7c37215)
    9.  [Number of articles by licence](#org59a1078)
    10. [Number of articles by primary category](#orgfbee4f3)
5.  [Data](#orgdfd4af3)
    1.  [Total data size](#org1f363c8)
    2.  [Number of tar archive files](#org7c268e6)
    3.  [Number of folders with ancillary files provided](#org70323da)
6.  [Images](#orgf713c17)
    1.  [Number of images total](#orgc6ff8a0)
    2.  [Number of images extracted from PDFs](#orgad60aa3)
    3.  [List of all different image file extensions](#org01a5152)
    4.  [Average number of images per article](#orgd73f915)
    5.  [Average size of images](#orgd6c5c65)
    6.  [Primary image formats](#orgcc90ef3)
    7.  [Primary image formats (combined, case-insensitive)](#org5a9623d)
    8.  [Highest number of images for a single article](#orgf70fd7c)
    9.  [Images by primary category](#org92263f6)
    10. [Images by month/year](#orgd35d6db)
    11. [Images by year](#org0a14982)
    12. [Images by image format](#org7459f1a)
    13. [Exif 'creator' metadata](#org2979c2b)


# Key

List of categories: <http://arxitics.com/help/categories>


# General


## Data timespan

04-1986 to 12-2018 (limited number of articles prior to 1991)

~32 years

392 months

~3816 articles per month (mean across entire timespan)

~25668 images per month (mean across entire timespan)
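
The month count and the articles-per-month mean can be reproduced with shell arithmetic. This is a minimal sketch, assuming that 392 is the difference in months between 04-1986 and 12-2018 and that the mean uses the arXiv.org article total of 1,495,708 quoted in the Text section. The function names are illustrative only.

```bash
# months_between Y1 M1 Y2 M2: difference in months between two year/month pairs
months_between() {
  echo $(( ($3 - $1) * 12 + ($4 - $2) ))
}

# mean_per_month TOTAL MONTHS: mean rounded to the nearest integer
mean_per_month() {
  awk -v t="$1" -v m="$2" 'BEGIN { printf "%.0f\n", t / m }'
}

months=$(months_between 1986 4 2018 12)
echo "$months months"
echo "~$(mean_per_month 1495708 "$months") articles per month"
```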


# Text


## Total number of articles

-   arXiv.org: 1,495,708 [2019-02-04 Mon]
-   metadata imported into SQLite: 1,506,177 [2019-04-12 Fri] (metadata downloaded on [2019-03-06 Wed])
-   number of folders below the year/month folders: 1,476,538 [2019-03-05 Tue]

```bash
# count all directories exactly two levels down, i.e. the article folders below the year/month folders
find . -mindepth 2 -maxdepth 2 -type d | wc -l
```


## Number of PDF-only articles

114,132

(114,132 / 1,476,538 = 7.73% of total articles)

```bash
# find all pdf files
find . -name "*.pdf" | wc -l
```


## Number of text files extracted from PDFs

114,131


## Number of articles that include source (generally in TeX or LaTeX format)

1,369,530

(includes source-only, i.e. no images)


## Number of source-only articles (single source file only)

324,101

(324,101 / 1,476,538 = 21.95% of total articles)

```bash
# count all .gz files
find . -type f -name "*.gz" | wc -l
# note: the .gz files have since been extracted, so repeating this check now needs a more complex command
```
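
Once the archives are extracted, one possible replacement check is to count the article folders that hold exactly one regular file. This is a rough sketch, not the command used for the figure above. It assumes article folders sit two levels below the starting directory, as in the folder count earlier, and it does not distinguish a lone source file from a lone PDF. The name `count_single_file_dirs` is hypothetical, and paths containing newlines are not handled.

```bash
# count_single_file_dirs ROOT: number of depth-2 folders under ROOT
# that contain exactly one regular file (searched recursively)
count_single_file_dirs() {
  find "$1" -mindepth 2 -maxdepth 2 -type d | while IFS= read -r dir; do
    n=$(find "$dir" -type f | wc -l)
    if [ "$n" -eq 1 ]; then
      printf '%s\n' "$dir"
    fi
  done | wc -l
}

# usage: count_single_file_dirs .
```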


## Articles with at least one image

1,476,538 - 324,101 - 114,132 = 1,038,305

(1,038,305 / 1,476,538 = 70.32% of total articles)
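
The subtraction and the percentage can be checked in the shell. This is a small sketch using the counts quoted in this section. The variable names and the `pct` helper are illustrative, and rounding to two decimal places is an assumption about how the percentages were presented.

```bash
total=1476538        # article folders
source_only=324101   # articles with a single source file
pdf_only=114132      # PDF-only articles

# articles with at least one image = total minus the two image-free groups
with_images=$(( total - source_only - pdf_only ))

# pct PART WHOLE: percentage with two decimal places
pct() {
  awk -v p="$1" -v w="$2" 'BEGIN { printf "%.2f\n", 100 * p / w }'
}

echo "$with_images articles ($(pct "$with_images" "$total")% of total)"
```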


## Articles by month/year

| month   | total |
|------- |----- |
| 04-1986 | 1     |
| 11-1988 | 1     |
| 04-1989 | 1     |
| 10-1989 | 3     |
| 11-1989 | 2     |
| 12-1989 | 2     |
| 01-1990 | 4     |
| 02-1990 | 2     |
| 03-1990 | 3     |
| 04-1990 | 1     |
| 05-1990 | 2     |
| 06-1990 | 2     |
| 07-1990 | 2     |
| 08-1990 | 1     |
| 09-1990 | 4     |
| 11-1990 | 1     |
| 12-1990 | 3     |
| 01-1991 | 9     |
| 02-1991 | 3     |
| 03-1991 | 3     |
| 04-1991 | 4     |
| 05-1991 | 4     |
| 06-1991 | 5     |
| 07-1991 | 5     |
| 08-1991 | 29    |
| 09-1991 | 61    |
| 10-1991 | 83    |
| 11-1991 | 67    |
| 12-1991 | 97    |
| 01-1992 | 93    |
| 02-1992 | 129   |
| 03-1992 | 140   |
| 04-1992 | 221   |
| 05-1992 | 234   |
| 06-1992 | 250   |
| 07-1992 | 285   |
| 08-1992 | 231   |
| 09-1992 | 349   |
| 10-1992 | 384   |
| 11-1992 | 453   |
| 12-1992 | 412   |
| 01-1993 | 370   |
| 02-1993 | 435   |
| 03-1993 | 514   |
| 04-1993 | 497   |
| 05-1993 | 540   |
| 06-1993 | 549   |
| 07-1993 | 632   |
| 08-1993 | 546   |
| 09-1993 | 529   |
| 10-1993 | 663   |
| 11-1993 | 718   |
| 12-1993 | 735   |
| 01-1994 | 611   |
| 02-1994 | 655   |
| 03-1994 | 753   |
| 04-1994 | 714   |
| 05-1994 | 852   |
| 06-1994 | 895   |
| 07-1994 | 849   |
| 08-1994 | 764   |
| 09-1994 | 888   |
| 10-1994 | 935   |
| 11-1994 | 1111  |
| 12-1994 | 1058  |
| 01-1995 | 915   |
| 02-1995 | 983   |
| 03-1995 | 1152  |
| 04-1995 | 933   |
| 05-1995 | 1105  |
| 06-1995 | 1163  |
| 07-1995 | 1048  |
| 08-1995 | 1033  |
| 09-1995 | 1102  |
| 10-1995 | 1242  |
| 11-1995 | 1183  |
| 12-1995 | 1135  |
| 01-1996 | 1050  |
| 02-1996 | 1081  |
| 03-1996 | 1159  |
| 04-1996 | 1222  |
| 05-1996 | 1326  |
| 06-1996 | 1347  |
| 07-1996 | 1426  |
| 08-1996 | 1461  |
| 09-1996 | 1425  |
| 10-1996 | 1508  |
| 11-1996 | 1461  |
| 12-1996 | 1409  |
| 01-1997 | 1366  |
| 02-1997 | 1336  |
| 03-1997 | 1379  |
| 04-1997 | 1470  |
| 05-1997 | 1580  |
| 06-1997 | 1707  |
| 07-1997 | 1791  |
| 08-1997 | 1446  |
| 09-1997 | 1854  |
| 10-1997 | 2019  |
| 11-1997 | 1767  |
| 12-1997 | 1906  |
| 01-1998 | 1734  |
| 02-1998 | 1667  |
| 03-1998 | 1913  |
| 04-1998 | 1725  |
| 05-1998 | 1962  |
| 06-1998 | 2065  |
| 07-1998 | 2082  |
| 08-1998 | 1832  |
| 09-1998 | 2424  |
| 10-1998 | 2352  |
| 11-1998 | 2222  |
| 12-1998 | 2196  |
| 01-1999 | 1876  |
| 02-1999 | 1938  |
| 03-1999 | 2357  |
| 04-1999 | 2147  |
| 05-1999 | 2215  |
| 06-1999 | 2452  |
| 07-1999 | 2415  |
| 08-1999 | 2125  |
| 09-1999 | 2484  |
| 10-1999 | 2484  |
| 11-1999 | 2618  |
| 12-1999 | 2583  |
| 01-2000 | 2368  |
| 02-2000 | 2358  |
| 03-2000 | 2602  |
| 04-2000 | 2131  |
| 05-2000 | 2679  |
| 06-2000 | 2431  |
| 07-2000 | 2460  |
| 08-2000 | 2613  |
| 09-2000 | 2550  |
| 10-2000 | 2904  |
| 11-2000 | 2848  |
| 12-2000 | 2728  |
| 01-2001 | 2514  |
| 02-2001 | 2435  |
| 03-2001 | 2744  |
| 04-2001 | 2576  |
| 05-2001 | 2909  |
| 06-2001 | 2893  |
| 07-2001 | 2729  |
| 08-2001 | 2422  |
| 09-2001 | 2612  |
| 10-2001 | 3365  |
| 11-2001 | 3225  |
| 12-2001 | 2703  |
| 01-2002 | 2731  |
| 02-2002 | 2559  |
| 03-2002 | 2707  |
| 04-2002 | 2811  |
| 05-2002 | 3083  |
| 06-2002 | 2753  |
| 07-2002 | 3229  |
| 08-2002 | 2736  |
| 09-2002 | 3291  |
| 10-2002 | 3536  |
| 11-2002 | 3478  |
| 12-2002 | 3188  |
| 01-2003 | 2931  |
| 02-2003 | 2880  |
| 03-2003 | 3023  |
| 04-2003 | 3139  |
| 05-2003 | 3282  |
| 06-2003 | 3414  |
| 07-2003 | 3420  |
| 08-2003 | 2815  |
| 09-2003 | 3675  |
| 10-2003 | 3818  |
| 11-2003 | 3432  |
| 12-2003 | 3560  |
| 01-2004 | 3113  |
| 02-2004 | 3326  |
| 03-2004 | 3531  |
| 04-2004 | 3355  |
| 05-2004 | 3559  |
| 06-2004 | 3723  |
| 07-2004 | 3697  |
| 08-2004 | 3277  |
| 09-2004 | 3931  |
| 10-2004 | 4156  |
| 11-2004 | 4069  |
| 12-2004 | 3981  |
| 01-2005 | 3509  |
| 02-2005 | 3235  |
| 03-2005 | 3893  |
| 04-2005 | 3715  |
| 05-2005 | 3745  |
| 06-2005 | 3992  |
| 07-2005 | 3916  |
| 08-2005 | 3700  |
| 09-2005 | 4343  |
| 10-2005 | 4423  |
| 11-2005 | 4295  |
| 12-2005 | 4096  |
| 01-2006 | 3830  |
| 02-2006 | 3528  |
| 03-2006 | 4190  |
| 04-2006 | 3586  |
| 05-2006 | 4143  |
| 06-2006 | 4098  |
| 07-2006 | 4208  |
| 08-2006 | 4068  |
| 09-2006 | 4335  |
| 10-2006 | 5072  |
| 11-2006 | 4873  |
| 12-2006 | 4371  |
| 01-2007 | 4555  |
| 02-2007 | 4169  |
| 03-2007 | 4492  |
| 04-2007 | 4016  |
| 05-2007 | 4677  |
| 06-2007 | 4513  |
| 07-2007 | 4657  |
| 08-2007 | 4385  |
| 09-2007 | 4840  |
| 10-2007 | 5811  |
| 11-2007 | 5018  |
| 12-2007 | 4635  |
| 01-2008 | 4748  |
| 02-2008 | 4455  |
| 03-2008 | 4533  |
| 04-2008 | 4891  |
| 05-2008 | 4894  |
| 06-2008 | 4929  |
| 07-2008 | 5135  |
| 08-2008 | 4264  |
| 09-2008 | 5193  |
| 10-2008 | 5759  |
| 11-2008 | 4916  |
| 12-2008 | 5078  |
| 01-2009 | 4906  |
| 02-2009 | 4932  |
| 03-2009 | 5484  |
| 04-2009 | 4921  |
| 05-2009 | 5095  |
| 06-2009 | 5487  |
| 07-2009 | 5585  |
| 08-2009 | 4638  |
| 09-2009 | 5688  |
| 10-2009 | 6004  |
| 11-2009 | 5678  |
| 12-2009 | 5658  |
| 01-2010 | 5456  |
| 02-2010 | 5101  |
| 03-2010 | 5981  |
| 04-2010 | 5598  |
| 05-2010 | 5738  |
| 06-2010 | 5972  |
| 07-2010 | 5603  |
| 08-2010 | 5344  |
| 09-2010 | 6200  |
| 10-2010 | 6486  |
| 11-2010 | 6525  |
| 12-2010 | 6279  |
| 01-2011 | 5828  |
| 02-2011 | 5779  |
| 03-2011 | 6286  |
| 04-2011 | 5769  |
| 05-2011 | 6313  |
| 06-2011 | 6371  |
| 07-2011 | 6184  |
| 08-2011 | 6199  |
| 09-2011 | 6909  |
| 10-2011 | 6964  |
| 11-2011 | 7306  |
| 12-2011 | 6696  |
| 01-2012 | 6451  |
| 02-2012 | 6716  |
| 03-2012 | 6989  |
| 04-2012 | 6657  |
| 05-2012 | 7043  |
| 06-2012 | 7194  |
| 07-2012 | 7287  |
| 08-2012 | 6557  |
| 09-2012 | 6849  |
| 10-2012 | 8328  |
| 11-2012 | 7340  |
| 12-2012 | 6973  |
| 01-2013 | 7717  |
| 02-2013 | 7297  |
| 03-2013 | 8001  |
| 04-2013 | 7618  |
| 05-2013 | 7507  |
| 06-2013 | 7159  |
| 07-2013 | 8261  |
| 08-2013 | 6936  |
| 09-2013 | 7977  |
| 10-2013 | 8592  |
| 11-2013 | 7818  |
| 12-2013 | 7981  |
| 01-2014 | 8061  |
| 02-2014 | 7415  |
| 03-2014 | 8243  |
| 04-2014 | 7842  |
| 05-2014 | 7942  |
| 06-2014 | 7841  |
| 07-2014 | 8520  |
| 08-2014 | 7351  |
| 09-2014 | 8514  |
| 10-2014 | 8841  |
| 11-2014 | 8324  |
| 12-2014 | 8696  |
| 01-2015 | 7896  |
| 02-2015 | 8003  |
| 03-2015 | 9017  |
| 04-2015 | 8361  |
| 05-2015 | 8431  |
| 06-2015 | 8974  |
| 07-2015 | 8987  |
| 08-2015 | 8027  |
| 09-2015 | 9310  |
| 10-2015 | 9365  |
| 11-2015 | 9464  |
| 12-2015 | 9280  |
| 01-2016 | 8623  |
| 02-2016 | 8888  |
| 03-2016 | 9711  |
| 04-2016 | 8991  |
| 05-2016 | 9732  |
| 06-2016 | 9570  |
| 07-2016 | 9106  |
| 08-2016 | 8794  |
| 09-2016 | 9857  |
| 10-2016 | 10100 |
| 11-2016 | 10374 |
| 12-2016 | 9665  |
| 01-2017 | 9051  |
| 02-2017 | 8889  |
| 03-2017 | 11032 |
| 04-2017 | 9330  |
| 05-2017 | 10955 |
| 06-2017 | 10217 |
| 07-2017 | 10096 |
| 08-2017 | 9837  |
| 09-2017 | 10605 |
| 10-2017 | 11500 |
| 11-2017 | 11625 |
| 12-2017 | 10556 |
| 01-2018 | 10351 |
| 02-2018 | 10573 |
| 03-2018 | 11625 |
| 04-2018 | 11224 |
| 05-2018 | 12550 |
| 06-2018 | 11652 |
| 07-2018 | 11830 |
| 08-2018 | 10752 |
| 09-2018 | 11607 |
| 10-2018 | 13045 |
| 11-2018 | 12898 |
| 12-2018 | 11837 |
| 01-2019 | 11440 |
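
A per-month tally like this can in principle be derived from the folder layout. The sketch below assumes one top-level folder per month with one sub-folder per article, which matches the depth used in the folder count earlier. The folder naming and the function name `count_per_month` are assumptions, and this is not necessarily how the table was produced.

```bash
# count_per_month ROOT: prints "<count> <month-folder>" for each top-level folder
count_per_month() {
  ( cd "$1" && find . -mindepth 2 -maxdepth 2 -type d ) |
    cut -d/ -f2 | sort | uniq -c
}

# usage: count_per_month .
```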


## Articles by year

| year | articles |
|---- |-------- |
| 1986 | 1        |
| 1988 | 1        |
| 1989 | 8        |
| 1990 | 25       |
| 1991 | 370      |
| 1992 | 3181     |
| 1993 | 6728     |
| 1994 | 10085    |
| 1995 | 12994    |
| 1996 | 15876    |
| 1997 | 19621    |
| 1998 | 24174    |
| 1999 | 27694    |
| 2000 | 30672    |
| 2001 | 33127    |
| 2002 | 36102    |
| 2003 | 39389    |
| 2004 | 43719    |
| 2005 | 46863    |
| 2006 | 50303    |
| 2007 | 55768    |
| 2008 | 58796    |
| 2009 | 64077    |
| 2010 | 70283    |
| 2011 | 76604    |
| 2012 | 84385    |
| 2013 | 92864    |
| 2014 | 97593    |
| 2015 | 105124   |
| 2016 | 113422   |
| 2017 | 123750   |
| 2018 | 140242   |


## Number of articles by licence

| licence                                               | total   |
|----------------------------------------------------- |------- |
| <http://arxiv.org/licenses/nonexclusive-distrib/1.0/> | 1017997 |
| (none provided)                                       | 453077  |
| <http://creativecommons.org/licenses/by/4.0/>         | 10657   |
| <http://creativecommons.org/licenses/by/3.0/>         | 7944    |
| <http://creativecommons.org/licenses/by-nc-sa/3.0/>   | 5909    |
| <http://creativecommons.org/licenses/by-nc-sa/4.0/>   | 4617    |
| <http://creativecommons.org/licenses/publicdomain/>   | 2485    |
| <http://creativecommons.org/publicdomain/zero/1.0/>   | 1883    |
| <http://creativecommons.org/licenses/by-sa/4.0/>      | 1608    |


## Number of articles by primary category

| primary category   | total       |
|------------------ |----------- |
| acc-phys           | 47          |
| adap-org           | 306         |
| alg-geom           | 1209        |
| ao-sci             | 13          |
| astro-ph           | 94247       |
| astro-ph.CO        | 28674       |
| astro-ph.EP        | 11919       |
| astro-ph.GA        | 25325       |
| astro-ph.HE        | 22574       |
| astro-ph.IM        | 10284       |
| astro-ph.SR        | 28865       |
| atom-ph            | 68          |
| bayes-an           | 11          |
| chao-dyn           | 1770        |
| chem-ph            | 129         |
| cmp-lg             | 894         |
| comp-gas           | 140         |
| cond-mat           | 11357       |
| cond-mat.dis-nn    | 9026        |
| cond-mat.mes-hall  | 44643       |
| cond-mat.mtrl-sci  | 37750       |
| cond-mat.other     | 6224        |
| cond-mat.quant-gas | 9171        |
| cond-mat.soft      | 18858       |
| cond-mat.stat-mech | 31624       |
| cond-mat.str-el    | 34767       |
| cond-mat.supr-con  | 24563       |
| cs.AI              | 9059        |
| cs.AR              | 868         |
| cs.CC              | 3254        |
| cs.CE              | 1566        |
| cs.CG              | 2453        |
| cs.CL              | 8691        |
| cs.CR              | 7133        |
| cs.CV              | 21203       |
| cs.CY              | 3624        |
| cs.DB              | 3003        |
| cs.DC              | 5886        |
| cs.DL              | 1799        |
| cs.DM              | 3140        |
| cs.DS              | 8230        |
| cs.ET              | 867         |
| cs.FL              | 1587        |
| cs.GL              | 72          |
| cs.GR              | 787         |
| cs.GT              | 3515        |
| cs.HC              | 2218        |
| cs.IR              | 2993        |
| cs.IT              | 22021       |
| cs.LG              | 13984       |
| cs.LO              | 6030        |
| cs.MA              | 886         |
| cs.MM              | 1004        |
| cs.MS              | 597         |
| cs.NA              | 1043        |
| cs.NE              | 2783        |
| cs.NI              | 8587        |
| cs.OH              | 1649        |
| cs.OS              | 266         |
| cs.PF              | 592         |
| cs.PL              | 2523        |
| cs.RO              | 3956        |
| cs.SC              | 814         |
| cs.SD              | 1215        |
| cs.SE              | 4402        |
| cs.SI              | 4546        |
| cs.SY              | 4757        |
| dg-ga              | 562         |
| econ.EM            | 368         |
| econ.GN            | 157         |
| econ.TH            | 73          |
| eess.AS            | 365         |
| eess.IV            | 509         |
| eess.SP            | 2279        |
| funct-an           | 320         |
| gr-qc              | 44417       |
| hep-ex             | 18424       |
| hep-lat            | 15022       |
| hep-ph             | 105924      |
| hep-th             | 84481       |
| math-ph            | 24790       |
| math.AC            | 5519        |
| math.AG            | 24434       |
| math.AP            | 26974       |
| math.AT            | 5780        |
| math.CA            | 10400       |
| math.CO            | 26220       |
| math.CT            | 2277        |
| math.CV            | 6990        |
| math.DG            | 19860       |
| math.DS            | 14021       |
| math.FA            | 12330       |
| math.GM            | 2217        |
| math.GN            | 2043        |
| math.GR            | 9127        |
| math.GT            | 10794       |
| math.HO            | 1767        |
| math.KT            | 1837        |
| math.LO            | 6030        |
| math.MG            | 3969        |
| math.NA            | 12835       |
| math.NT            | 19960       |
| math.OA            | 5927        |
| math.OC            | 13562       |
| math.PR            | 25233       |
| math.QA            | 7080        |
| math.RA            | 7037        |
| math.RT            | 10079       |
| math.SG            | 3381        |
| math.SP            | 3264        |
| math.ST            | 8953        |
| mtrl-th            | 165         |
| nlin.AO            | 1745        |
| nlin.CD            | 5575        |
| nlin.CG            | 386         |
| nlin.PS            | 3112        |
| nlin.SI            | 3955        |
| nucl-ex            | 9077        |
| nucl-th            | 26970       |
| patt-sol           | 452         |
| physics.acc-ph     | 4223        |
| physics.ao-ph      | 1727        |
| physics.app-ph     | 2264        |
| physics.atm-clus   | 954         |
| physics.atom-ph    | 8704        |
| physics.bio-ph     | 4132        |
| physics.chem-ph    | 5857        |
| physics.class-ph   | 3395        |
| physics.comp-ph    | 4053        |
| physics.data-an    | 2459        |
| physics.ed-ph      | 1847        |
| physics.flu-dyn    | 9162        |
| physics.gen-ph     | 7418        |
| physics.geo-ph     | 2007        |
| physics.hist-ph    | 2026        |
| physics.ins-det    | 8593        |
| physics.med-ph     | 1826        |
| physics.optics     | 16181       |
| physics.plasm-ph   | 6638        |
| physics.pop-ph     | 889         |
| physics.soc-ph     | 7304        |
| physics.space-ph   | 1236        |
| plasm-ph           | 28          |
| q-alg              | 1177        |
| q-bio.BM           | 1699        |
| q-bio.CB           | 657         |
| q-bio.GN           | 1123        |
| q-bio.MN           | 1612        |
| q-bio.NC           | 3142        |
| q-bio.OT           | 459         |
| q-bio.PE           | 4173        |
| q-bio.QM           | 2432        |
| q-bio.SC           | 529         |
| q-bio.TO           | 640         |
| q-fin.CP           | 597         |
| q-fin.EC           | 395         |
| q-fin.GN           | 998         |
| q-fin.MF           | 668         |
| q-fin.PM           | 623         |
| q-fin.PR           | 900         |
| q-fin.RM           | 669         |
| q-fin.ST           | 1043        |
| q-fin.TR           | 565         |
| quant-ph           | 69124       |
| solv-int           | 844         |
| stat.AP            | 4435        |
| stat.CO            | 2067        |
| stat.ME            | 7864        |
| stat.ML            | 7713        |
| stat.OT            | 333         |
| supr-con           | 69          |
| **total**          | **1506562** |


# Data


## Total data size

2.1 TB

```bash
# calculate disk usage across arXiv/src_all folder
du ~/arXiv/src_all -h --max-depth 1
```


## Number of tar archive files

2150


## Number of folders with ancillary files provided

3343

```bash
# find all folders named exactly "anc"
find . -name "anc" | wc -l
```


# Images


## Number of images total

10,053,059

(total in `filepaths_all_images.txt`)

```bash
# find all image files (the output is written to a text file of paths)
find . -type f \( -iname "*.png" -o -iname "*.eps" -o -iname "*.pdf" -o -iname "*.ps" -o -iname "*.jpg" \
-o -iname "*.jpeg" -o -iname "*.pstex" -o -iname "*.gif" -o -iname "*.svg" -o -iname "*.epsf" \) \
-not -name "*pdf_image-*"
# full command in bash script image_paths_to_txt.sh
```

10,061,232

(this is the total number of rows in the sqlite database, written via the find command)

10,061,158

(total number of rows in sqlite database, after cleaning)

10,053,059

(total number of rows in sqlite database, not including null values for x, y, or imageformat)


## Number of images extracted from PDFs

27,198,781


## List of all different image file extensions

For full list, see <https://github.com/re-imaging/re-imaging/blob/master/statistics/file_extension_totals.org>

```bash
# command for finding files using perl
find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u

# or all in one go, getting totals and writing to text file
find . -type f | grep -E ".*\.[a-zA-Z0-9]*$" | sed -e 's/.*\(\.[a-zA-Z0-9]*\)$/\1/' | sort | uniq -c | sort -n > ../format_totals_final.txt
```
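
To merge extensions that differ only in letter case (for example `eps` and `EPS`), the extensions can be lower-cased before counting. This is a sketch of one way to do that, not necessarily the command used for the tables in this document. The function name `ext_totals` is hypothetical.

```bash
# ext_totals ROOT: counts files by lower-cased extension, largest count first
ext_totals() {
  find "$1" -type f -name '*.*' |
    sed -n 's/.*\.\([A-Za-z0-9]\{1,\}\)$/\1/p' |
    tr '[:upper:]' '[:lower:]' |
    sort | uniq -c | sort -rn
}

# usage: ext_totals . > ../format_totals_lowercase.txt
```

The `-name '*.*'` test keeps files without an extension out of the tally, even when a parent folder name contains a dot.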


## Average number of images per article

6.814069127

(10061232 / 1476538 = 6.814069127)


## Average size of images

615 x 478 pixels

Mean across the entire dataset: 614.5988512991947 x 478.21691675858534 pixels.

Calculated using the SQLite database.


## Primary image formats

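From bash find (case-sensitive: an extension listed more than once presumably differs only in letter case, since the repeated rows add up to the case-insensitive totals in the next section)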

| total        | extension |
|------------ |--------- |
| 4202415      | eps       |
| 3299213      | pdf       |
| 1090973      | png       |
| 905970       | ps        |
| 450816       | jpg       |
| 26164        | jpeg      |
| 25141        | eps       |
| 24190        | pstex     |
| 18496        | gif       |
| 15182        | epsi      |
| 12404        | svg       |
| 11256        | png       |
| 7788         | jpg       |
| 5236         | ps        |
| 3425         | epsf      |
| 1386         | pdf       |
| 919          | jpeg      |
| 606          | gif       |
| **10101580** | **total** |


## Primary image formats (combined, case-insensitive)

From bash find

| total        | extension |
|------------ |--------- |
| 4227556      | eps       |
| 3300599      | pdf       |
| 1102229      | png       |
| 911206       | ps        |
| 485687       | jpg       |
| 24190        | pstex     |
| 19102        | gif       |
| 15182        | epsi      |
| 12404        | svg       |
| 3425         | epsf      |
| **10101580** | **total** |

SQLite

| total        | extension |
|------------ |--------- |
| 4223083      | eps       |
| 3299043      | pdf       |
| 1076731      | png       |
| 909314       | ps        |
| 485452       | jpg       |
| 23922        | pstex     |
| 19054        | gif       |
| 12400        | svg       |
| 4060         | epsf      |
| **10053059** | **total** |

SQLite with percentage

| extension | total        | %           |
|--------- |------------ |----------- |
| eps       | 4223083      | 42.007940   |
| pdf       | 3299043      | 32.816310   |
| png       | 1076731      | 10.710481   |
| ps        | 909314       | 9.0451474   |
| jpg       | 485452       | 4.8288983   |
| pstex     | 23922        | 0.23795742  |
| gif       | 19054        | 0.18953435  |
| svg       | 12400        | 0.12334554  |
| epsf      | 4060         | 0.040385717 |
| **total** | **10053059** | **100**     |
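
The percentage column is each count divided by the column total. This is a minimal `awk` sketch using the SQLite counts from the table above. The function name `add_percent` is illustrative, and rounding to two decimal places is a presentation choice rather than the precision used in the table.

```bash
# add_percent: reads "extension count" lines on stdin and
# appends each count's share of the column total as a percentage
add_percent() {
  awk '{ name[NR] = $1; count[NR] = $2; sum += $2 }
       END { for (i = 1; i <= NR; i++)
               printf "%s %d %.2f\n", name[i], count[i], 100 * count[i] / sum }'
}

printf '%s\n' "eps 4223083" "pdf 3299043" "png 1076731" "ps 909314" "jpg 485452" \
  "pstex 23922" "gif 19054" "svg 12400" "epsf 4060" | add_percent
```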


## Highest number of images for a single article

67

article: `1804.11192`


## Images by primary category

171 distinct primary categories (the primary category is the first-listed subject area of an article)

| primary category   | total  | rank |
|------------------ |------ |---- |
| hep-ph             | 814037 | 1    |
| astro-ph           | 742929 | 2    |
| cs.CV              | 536024 | 3    |
| astro-ph.GA        | 414296 | 4    |
| astro-ph.CO        | 394900 | 5    |
| astro-ph.SR        | 368520 | 6    |
| quant-ph           | 307949 | 7    |
| hep-th             | 287747 | 8    |
| astro-ph.HE        | 260679 | 9    |
| cond-mat.mes-hall  | 243985 | 10   |
| cond-mat.str-el    | 242199 | 11   |
| hep-ex             | 225621 | 12   |
| cond-mat.stat-mech | 208411 | 13   |
| nucl-th            | 199725 | 14   |
| gr-qc              | 195447 | 15   |
| cs.LG              | 189391 | 16   |
| math.NA            | 183991 | 17   |
| cond-mat.mtrl-sci  | 176125 | 18   |
| cond-mat.soft      | 150161 | 19   |
| cs.IT              | 148683 | 20   |
| astro-ph.EP        | 143683 | 21   |
| hep-lat            | 129076 | 22   |
| stat.ML            | 128104 | 23   |
| cond-mat.supr-con  | 126255 | 24   |
| astro-ph.IM        | 123009 | 25   |
| math.GT            | 116189 | 26   |
| physics.flu-dyn    | 112735 | 27   |
| math.OC            | 94774  | 28   |
| physics.ins-det    | 92583  | 29   |
| nucl-ex            | 87936  | 30   |
| stat.ME            | 85470  | 31   |
| cs.NI              | 82362  | 32   |
| math.CO            | 81315  | 33   |
| physics.optics     | 72789  | 34   |
| cond-mat.quant-gas | 72275  | 35   |
| physics.soc-ph     | 66901  | 36   |
| math-ph            | 65447  | 37   |
| cond-mat.dis-nn    | 64626  | 38   |
| cs.SI              | 61405  | 39   |
| cs.RO              | 61170  | 40   |
| math.DS            | 59980  | 41   |
| cs.AI              | 55324  | 42   |
| cs.DC              | 54255  | 43   |
| cs.CL              | 53137  | 44   |
| math.AP            | 49096  | 45   |
| nlin.CD            | 48426  | 46   |
| physics.atom-ph    | 48098  | 47   |
| stat.AP            | 47455  | 48   |
| math.PR            | 47398  | 49   |
| physics.comp-ph    | 46275  | 50   |
| cs.CR              | 46057  | 51   |
| physics.plasm-ph   | 45813  | 52   |
| math.ST            | 43833  | 53   |
| cs.SY              | 41649  | 54   |
| cs.DS              | 40134  | 55   |
| cs.SE              | 39910  | 56   |
| cond-mat           | 38925  | 57   |
| nlin.PS            | 37559  | 58   |
| cs.CG              | 36602  | 59   |
| cond-mat.other     | 34865  | 60   |
| physics.chem-ph    | 34722  | 61   |
| cs.DB              | 31349  | 62   |
| math.AG            | 30025  | 63   |
| q-bio.PE           | 30017  | 64   |
| physics.bio-ph     | 27860  | 65   |
| physics.acc-ph     | 27346  | 66   |
| cs.NE              | 26836  | 67   |
| math.DG            | 24935  | 68   |
| stat.CO            | 24724  | 69   |
| physics.data-an    | 24008  | 70   |
| q-bio.NC           | 22542  | 71   |
| math.QA            | 21658  | 72   |
| eess.SP            | 21086  | 73   |
| cs.IR              | 20302  | 74   |
| cs.GR              | 19099  | 75   |
| q-bio.QM           | 18591  | 76   |
| cs.CE              | 17945  | 77   |
| physics.class-ph   | 16750  | 78   |
| cs.GT              | 15922  | 79   |
| cs.DM              | 15523  | 80   |
| cs.LO              | 15016  | 81   |
| cs.NA              | 14941  | 82   |
| cs.CY              | 14680  | 83   |
| math.MG            | 14107  | 84   |
| nlin.AO            | 13874  | 85   |
| cs.HC              | 13853  | 86   |
| physics.gen-ph     | 13623  | 87   |
| physics.geo-ph     | 13167  | 88   |
| physics.ao-ph      | 13132  | 89   |
| math.GR            | 12865  | 90   |
| q-bio.MN           | 11727  | 91   |
| nlin.SI            | 11599  | 92   |
| q-fin.ST           | 11550  | 93   |
| physics.med-ph     | 11345  | 94   |
| q-bio.BM           | 11331  | 95   |
| math.SG            | 11173  | 96   |
| math.CA            | 10697  | 97   |
| cs.MM              | 10358  | 98   |
| math.NT            | 10281  | 99   |
| cs.SD              | 10012  | 100  |
| math.AT            | 9265   | 101  |
| math.RT            | 9238   | 102  |
| eess.IV            | 9033   | 103  |
| cs.PL              | 8763   | 104  |
| cs.CC              | 8591   | 105  |
| cs.ET              | 8549   | 106  |
| physics.app-ph     | 8121   | 107  |
| chao-dyn           | 7958   | 108  |
| math.CT            | 7616   | 109  |
| cs.AR              | 7272   | 110  |
| physics.space-ph   | 7037   | 111  |
| cs.MA              | 6945   | 112  |
| physics.ed-ph      | 6663   | 113  |
| math.HO            | 6652   | 114  |
| q-bio.GN           | 6492   | 115  |
| cs.PF              | 6451   | 116  |
| math.FA            | 6340   | 117  |
| math.CV            | 6208   | 118  |
| q-fin.TR           | 6145   | 119  |
| nlin.CG            | 5789   | 120  |
| cs.MS              | 5764   | 121  |
| physics.atm-clus   | 5550   | 122  |
| cs.OH              | 5514   | 123  |
| math.OA            | 5367   | 124  |
| q-bio.CB           | 5302   | 125  |
| q-fin.GN           | 5120   | 126  |
| q-fin.CP           | 5099   | 127  |
| cs.DL              | 5009   | 128  |
| q-fin.PR           | 4990   | 129  |
| math.SP            | 4888   | 130  |
| q-fin.RM           | 4480   | 131  |
| cs.FL              | 4194   | 132  |
| q-bio.TO           | 3990   | 133  |
| physics.hist-ph    | 3614   | 134  |
| q-bio.SC           | 3286   | 135  |
| econ.EM            | 3238   | 136  |
| q-fin.MF           | 3205   | 137  |
| math.RA            | 3182   | 138  |
| physics.pop-ph     | 2870   | 139  |
| q-fin.PM           | 2731   | 140  |
| math.GM            | 2650   | 141  |
| eess.AS            | 2421   | 142  |
| q-fin.EC           | 2140   | 143  |
| math.AC            | 2138   | 144  |
| patt-sol           | 2113   | 145  |
| stat.OT            | 1875   | 146  |
| math.GN            | 1757   | 147  |
| cs.OS              | 1692   | 148  |
| cs.SC              | 1638   | 149  |
| q-alg              | 1586   | 150  |
| q-bio.OT           | 1478   | 151  |
| cmp-lg             | 1346   | 152  |
| math.LO            | 1311   | 153  |
| adap-org           | 1307   | 154  |
| mtrl-th            | 659    | 155  |
| econ.GN            | 587    | 156  |
| comp-gas           | 579    | 157  |
| math.KT            | 579    | 158  |
| solv-int           | 549    | 159  |
| chem-ph            | 424    | 160  |
| alg-geom           | 419    | 161  |
| econ.TH            | 223    | 162  |
| dg-ga              | 211    | 163  |
| supr-con           | 186    | 164  |
| atom-ph            | 155    | 165  |
| acc-phys           | 119    | 166  |
| cs.GL              | 113    | 167  |
| ao-sci             | 68     | 168  |
| funct-an           | 38     | 169  |
| plasm-ph           | 37     | 170  |
| bayes-an           | 17     | 171  |


## Images by month/year

| month   | total  |
|------- |------ |
| 11-1988 | 11     |
| 01-1990 | 7      |
| 04-1990 | 27     |
| 05-1990 | 92     |
| 09-1990 | 4      |
| 01-1991 | 9      |
| 03-1991 | 6      |
| 04-1991 | 10     |
| 05-1991 | 1      |
| 06-1991 | 7      |
| 08-1991 | 9      |
| 09-1991 | 64     |
| 10-1991 | 39     |
| 11-1991 | 1      |
| 01-1992 | 7      |
| 02-1992 | 20     |
| 03-1992 | 19     |
| 04-1992 | 114    |
| 05-1992 | 83     |
| 06-1992 | 40     |
| 07-1992 | 103    |
| 08-1992 | 36     |
| 09-1992 | 74     |
| 10-1992 | 100    |
| 11-1992 | 188    |
| 12-1992 | 188    |
| 01-1993 | 197    |
| 02-1993 | 149    |
| 03-1993 | 269    |
| 04-1993 | 350    |
| 05-1993 | 534    |
| 06-1993 | 418    |
| 07-1993 | 531    |
| 08-1993 | 511    |
| 09-1993 | 650    |
| 10-1993 | 948    |
| 11-1993 | 1190   |
| 12-1993 | 1138   |
| 01-1994 | 1216   |
| 02-1994 | 1135   |
| 03-1994 | 1447   |
| 04-1994 | 1252   |
| 05-1994 | 1801   |
| 06-1994 | 1911   |
| 07-1994 | 1674   |
| 08-1994 | 1550   |
| 09-1994 | 1849   |
| 10-1994 | 1669   |
| 11-1994 | 2206   |
| 12-1994 | 2426   |
| 01-1995 | 2035   |
| 02-1995 | 1807   |
| 03-1995 | 2242   |
| 04-1995 | 1599   |
| 05-1995 | 1998   |
| 06-1995 | 2310   |
| 07-1995 | 1888   |
| 08-1995 | 2264   |
| 09-1995 | 2314   |
| 10-1995 | 2630   |
| 11-1995 | 2706   |
| 12-1995 | 2970   |
| 01-1996 | 3013   |
| 02-1996 | 3766   |
| 03-1996 | 3296   |
| 04-1996 | 3607   |
| 05-1996 | 4008   |
| 06-1996 | 4201   |
| 07-1996 | 4397   |
| 08-1996 | 4893   |
| 09-1996 | 4578   |
| 10-1996 | 5464   |
| 11-1996 | 5054   |
| 12-1996 | 4807   |
| 01-1997 | 5076   |
| 02-1997 | 4974   |
| 03-1997 | 4648   |
| 04-1997 | 5659   |
| 05-1997 | 5973   |
| 06-1997 | 6467   |
| 07-1997 | 7656   |
| 08-1997 | 5846   |
| 09-1997 | 6970   |
| 10-1997 | 7753   |
| 11-1997 | 7193   |
| 12-1997 | 7498   |
| 01-1998 | 6772   |
| 02-1998 | 6410   |
| 03-1998 | 7823   |
| 04-1998 | 7187   |
| 05-1998 | 8224   |
| 06-1998 | 9845   |
| 07-1998 | 8757   |
| 08-1998 | 7459   |
| 09-1998 | 10178  |
| 10-1998 | 9632   |
| 11-1998 | 9564   |
| 12-1998 | 9811   |
| 01-1999 | 8296   |
| 02-1999 | 8569   |
| 03-1999 | 11452  |
| 04-1999 | 9233   |
| 05-1999 | 9829   |
| 06-1999 | 10328  |
| 07-1999 | 10859  |
| 08-1999 | 9508   |
| 09-1999 | 10635  |
| 10-1999 | 10783  |
| 11-1999 | 11561  |
| 12-1999 | 11136  |
| 01-2000 | 10807  |
| 02-2000 | 10987  |
| 03-2000 | 11485  |
| 04-2000 | 9327   |
| 05-2000 | 12045  |
| 06-2000 | 11373  |
| 07-2000 | 11610  |
| 08-2000 | 11651  |
| 09-2000 | 10320  |
| 10-2000 | 12712  |
| 11-2000 | 12927  |
| 12-2000 | 12616  |
| 01-2001 | 11486  |
| 02-2001 | 11007  |
| 03-2001 | 12499  |
| 04-2001 | 11294  |
| 05-2001 | 13199  |
| 06-2001 | 13272  |
| 07-2001 | 13760  |
| 08-2001 | 11189  |
| 09-2001 | 12099  |
| 10-2001 | 14776  |
| 11-2001 | 13647  |
| 12-2001 | 12547  |
| 01-2002 | 13086  |
| 02-2002 | 11750  |
| 03-2002 | 13358  |
| 04-2002 | 14205  |
| 05-2002 | 14542  |
| 06-2002 | 13629  |
| 07-2002 | 16789  |
| 08-2002 | 12860  |
| 09-2002 | 14776  |
| 10-2002 | 15823  |
| 11-2002 | 16046  |
| 12-2002 | 14949  |
| 01-2003 | 14805  |
| 02-2003 | 14005  |
| 03-2003 | 14668  |
| 04-2003 | 14256  |
| 05-2003 | 16013  |
| 06-2003 | 16509  |
| 07-2003 | 17312  |
| 08-2003 | 14161  |
| 09-2003 | 17667  |
| 10-2003 | 18252  |
| 11-2003 | 16043  |
| 12-2003 | 17114  |
| 01-2004 | 15250  |
| 02-2004 | 17099  |
| 03-2004 | 17894  |
| 04-2004 | 16465  |
| 05-2004 | 17854  |
| 06-2004 | 20144  |
| 07-2004 | 18503  |
| 08-2004 | 17117  |
| 09-2004 | 19438  |
| 10-2004 | 20612  |
| 11-2004 | 20161  |
| 12-2004 | 20131  |
| 01-2005 | 17608  |
| 02-2005 | 16486  |
| 03-2005 | 19846  |
| 04-2005 | 19527  |
| 05-2005 | 19122  |
| 06-2005 | 22451  |
| 07-2005 | 21567  |
| 08-2005 | 18794  |
| 09-2005 | 22753  |
| 10-2005 | 23208  |
| 11-2005 | 21318  |
| 12-2005 | 21203  |
| 01-2006 | 19489  |
| 02-2006 | 17896  |
| 03-2006 | 23669  |
| 04-2006 | 18828  |
| 05-2006 | 21587  |
| 06-2006 | 21854  |
| 07-2006 | 22494  |
| 08-2006 | 21812  |
| 09-2006 | 24613  |
| 10-2006 | 25578  |
| 11-2006 | 26112  |
| 12-2006 | 22846  |
| 01-2007 | 23661  |
| 02-2007 | 21987  |
| 03-2007 | 23706  |
| 04-2007 | 22485  |
| 05-2007 | 25668  |
| 06-2007 | 24426  |
| 07-2007 | 25360  |
| 08-2007 | 24225  |
| 09-2007 | 26571  |
| 10-2007 | 31672  |
| 11-2007 | 27463  |
| 12-2007 | 25980  |
| 01-2008 | 27561  |
| 02-2008 | 25120  |
| 03-2008 | 25970  |
| 04-2008 | 27261  |
| 05-2008 | 27428  |
| 06-2008 | 28252  |
| 07-2008 | 29978  |
| 08-2008 | 25154  |
| 09-2008 | 30985  |
| 10-2008 | 35081  |
| 11-2008 | 28507  |
| 12-2008 | 30994  |
| 01-2009 | 29999  |
| 02-2009 | 27152  |
| 03-2009 | 31566  |
| 04-2009 | 28030  |
| 05-2009 | 30822  |
| 06-2009 | 34584  |
| 07-2009 | 35045  |
| 08-2009 | 31141  |
| 09-2009 | 35056  |
| 10-2009 | 36168  |
| 11-2009 | 33965  |
| 12-2009 | 34971  |
| 01-2010 | 32916  |
| 02-2010 | 30680  |
| 03-2010 | 34933  |
| 04-2010 | 34588  |
| 05-2010 | 34520  |
| 06-2010 | 37563  |
| 07-2010 | 34320  |
| 08-2010 | 33145  |
| 09-2010 | 38881  |
| 10-2010 | 39270  |
| 11-2010 | 42457  |
| 12-2010 | 38161  |
| 01-2011 | 37872  |
| 02-2011 | 35109  |
| 03-2011 | 40708  |
| 04-2011 | 35983  |
| 05-2011 | 38638  |
| 06-2011 | 40226  |
| 07-2011 | 41267  |
| 08-2011 | 41337  |
| 09-2011 | 46899  |
| 10-2011 | 46266  |
| 11-2011 | 48216  |
| 12-2011 | 44847  |
| 01-2012 | 42370  |
| 02-2012 | 44005  |
| 03-2012 | 45168  |
| 04-2012 | 43510  |
| 05-2012 | 46642  |
| 06-2012 | 47912  |
| 07-2012 | 48265  |
| 08-2012 | 46157  |
| 09-2012 | 45269  |
| 10-2012 | 53842  |
| 11-2012 | 49689  |
| 12-2012 | 48001  |
| 01-2013 | 48995  |
| 02-2013 | 45883  |
| 03-2013 | 52934  |
| 04-2013 | 51476  |
| 05-2013 | 50673  |
| 06-2013 | 50448  |
| 07-2013 | 62295  |
| 08-2013 | 52996  |
| 09-2013 | 71950  |
| 10-2013 | 61687  |
| 11-2013 | 55479  |
| 12-2013 | 54234  |
| 01-2014 | 55454  |
| 02-2014 | 53244  |
| 03-2014 | 61297  |
| 04-2014 | 55829  |
| 05-2014 | 60058  |
| 06-2014 | 57758  |
| 07-2014 | 66888  |
| 08-2014 | 55138  |
| 09-2014 | 63416  |
| 10-2014 | 65598  |
| 11-2014 | 65634  |
| 12-2014 | 68876  |
| 01-2015 | 61961  |
| 02-2015 | 61664  |
| 03-2015 | 72438  |
| 04-2015 | 68725  |
| 05-2015 | 70703  |
| 06-2015 | 73845  |
| 07-2015 | 70855  |
| 08-2015 | 64263  |
| 09-2015 | 76662  |
| 10-2015 | 75521  |
| 11-2015 | 84480  |
| 12-2015 | 76998  |
| 01-2016 | 72871  |
| 02-2016 | 74819  |
| 03-2016 | 87150  |
| 04-2016 | 78843  |
| 05-2016 | 86293  |
| 06-2016 | 95666  |
| 07-2016 | 77832  |
| 08-2016 | 75794  |
| 09-2016 | 85315  |
| 10-2016 | 88463  |
| 11-2016 | 93998  |
| 12-2016 | 86732  |
| 01-2017 | 75725  |
| 02-2017 | 76541  |
| 03-2017 | 99462  |
| 04-2017 | 88333  |
| 05-2017 | 99221  |
| 06-2017 | 90892  |
| 07-2017 | 90875  |
| 08-2017 | 91564  |
| 09-2017 | 99620  |
| 10-2017 | 104697 |
| 11-2017 | 107585 |
| 12-2017 | 105499 |
| 01-2018 | 94672  |
| 02-2018 | 102907 |
| 03-2018 | 110683 |
| 04-2018 | 112673 |
| 05-2018 | 117354 |
| 06-2018 | 109180 |
| 07-2018 | 114857 |
| 08-2018 | 110967 |
| 09-2018 | 111968 |
| 10-2018 | 128121 |
| 11-2018 | 130495 |
| 12-2018 | 120037 |


## Images by year

| year | images  |
|---- |------- |
| 1988 | 11      |
| 1990 | 130     |
| 1991 | 146     |
| 1992 | 972     |
| 1993 | 6885    |
| 1994 | 20136   |
| 1995 | 26763   |
| 1996 | 51088   |
| 1997 | 75713   |
| 1998 | 101662  |
| 1999 | 122189  |
| 2000 | 137860  |
| 2001 | 150775  |
| 2002 | 171813  |
| 2003 | 190805  |
| 2004 | 220669  |
| 2005 | 243900  |
| 2006 | 266790  |
| 2007 | 303204  |
| 2008 | 342292  |
| 2009 | 388500  |
| 2010 | 431434  |
| 2011 | 497368  |
| 2012 | 560836  |
| 2013 | 659050  |
| 2014 | 729214  |
| 2015 | 858174  |
| 2016 | 1003842 |
| 2017 | 1130770 |
| 2018 | 1368231 |
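
The yearly totals can be cross-checked against the month/year table: the twelve 1992 rows sum to 972, which matches the 1992 row above. A minimal sketch of that aggregation follows; the function name `yearly_totals` and the in-memory list of `(MM-YYYY, count)` pairs are assumptions made for illustration, not the queries originally used.

```python
from collections import defaultdict


def yearly_totals(rows):
    """Sum (MM-YYYY, count) pairs into a {year: total} dict."""
    totals = defaultdict(int)
    for month_year, count in rows:
        # The label is "MM-YYYY", so the year is the part after the dash.
        year = int(month_year.split("-")[1])
        totals[year] += count
    return dict(totals)


# The twelve 1992 rows copied from the month/year table.
rows_1992 = [
    ("01-1992", 7), ("02-1992", 20), ("03-1992", 19), ("04-1992", 114),
    ("05-1992", 83), ("06-1992", 40), ("07-1992", 103), ("08-1992", 36),
    ("09-1992", 74), ("10-1992", 100), ("11-1992", 188), ("12-1992", 188),
]

print(yearly_totals(rows_1992))  # {1992: 972}
```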


## Images by image format

Formats are as reported by the ImageMagick `identify` command. A blank format means the command produced no output.

| format | total   |
|------ |------- |
| PS     | 5149324 |
| PDF    | 3261411 |
| PNG    | 1079044 |
| JPEG   | 484113  |
| GIF    | 18742   |
| PDF612 | 13083   |
| SVG    | 12407   |
| PDF595 | 9874    |
|        | 8117    |
| PS360  | 1967    |
| PS612  | 1688    |
| EPS    | 1643    |
| PS596  | 1099    |
| PDF504 | 709     |
| PDF360 | 644     |
| PDF842 | 602     |
| PS504  | 563     |
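
A tally like this one could be gathered along the following lines. This is only a sketch: the helper names `identify_format` and `tally_formats` are hypothetical, and the use of `-format "%m"` on the first frame of each file is an assumption about how `identify` might be invoked, not a record of the command actually run. A file for which `identify` prints nothing is counted under the blank format.

```python
import subprocess
from collections import Counter


def identify_format(path):
    """Return the format name identify reports for the first frame of `path`.

    Returns "" when identify is unavailable or prints nothing.
    """
    try:
        result = subprocess.run(
            ["identify", "-format", "%m", f"{path}[0]"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # ImageMagick is not installed on this machine.
        return ""
    return result.stdout.strip()


def tally_formats(formats):
    """Count format names, most frequent first."""
    return Counter(formats).most_common()


# Hypothetical format strings standing in for real identify output.
sample = ["PS", "PDF", "PS", "", "PNG", "PS"]
for name, total in tally_formats(sample):
    print(f"| {name:<6} | {total} |")
```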


## Exif 'creator' metadata

Metadata extracted from all images using `exiftool`.
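
The counts and percentages in the table below could be derived roughly as sketched here. The sketch assumes the metadata was dumped as JSON (for example with `exiftool -j -Creator <files>`), that files without a `Creator` tag are counted as `(none)`, and that each percentage is the share of all processed images. The function name `creator_table` and the sample records are hypothetical, and any grouping of raw creator strings into tool names is left out.

```python
import json
from collections import Counter


def creator_table(exiftool_json):
    """Build (creator, total, percent) rows from exiftool JSON output."""
    records = json.loads(exiftool_json)
    # A missing or empty Creator tag is reported as "(none)".
    counts = Counter(str(rec.get("Creator") or "(none)") for rec in records)
    total = sum(counts.values())
    return [
        (name, n, round(100 * n / total, 2))
        for name, n in counts.most_common()
    ]


# Hypothetical records in the shape exiftool's JSON output uses.
sample = json.dumps([
    {"SourceFile": "fig1.eps", "Creator": "MATLAB"},
    {"SourceFile": "fig2.pdf", "Creator": "matplotlib"},
    {"SourceFile": "fig3.png"},
    {"SourceFile": "fig4.eps", "Creator": "MATLAB"},
])

for row in creator_table(sample):
    print(row)
```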

| Creator          | Total   | %     |
|---------------- |------- |----- |
| (none)           | 1997457 | 19.87 |
| MATLAB           | 876177  | 8.72  |
| Mathematica      | 492318  | 4.90  |
| matplotlib       | 491001  | 4.88  |
| IDL              | 404852  | 4.03  |
| gnuplot          | 396484  | 3.94  |
| cairo            | 388108  | 3.86  |
| fig2dev          | 349381  | 3.48  |
| SM               | 268902  | 2.67  |
| ROOT             | 265278  | 2.64  |
| Illustrator      | 263934  | 2.63  |
| Grace            | 237719  | 2.36  |
| dvips            | 232165  | 2.31  |
| TeX              | 209613  | 2.09  |
| GIMP             | 207108  | 2.06  |
| Ghostscript      | 199064  | 1.98  |
| OriginLab        | 168350  | 1.67  |
| HIGZ             | 144720  | 1.44  |
| R                | 143164  | 1.42  |
| PGPLOT           | 128704  | 1.28  |
| ImageMagick      | 123697  | 1.23  |
| CorelDRAW        | 91453   | 0.91  |
| jpeg2ps          | 87546   | 0.87  |
| PScript5         | 77136   | 0.77  |
| Photoshop        | 76648   | 0.76  |
| Acrobat          | 72191   | 0.72  |
| PowerPoint       | 50187   | 0.50  |
| XV               | 47320   | 0.47  |
| Ipe              | 43498   | 0.43  |
| Keynote          | 37964   | 0.38  |
| xmgr             | 37831   | 0.38  |
| PSCRIPT          | 36755   | 0.37  |
| inkscape         | 32036   | 0.32  |
| OmniGraffle      | 30788   | 0.31  |
| LaTeX            | 30473   | 0.30  |
| Preview          | 24770   | 0.25  |
| GraphicConverter | 24124   | 0.24  |
| FreeHEP          | 23621   | 0.23  |
| GTVIRT           | 20680   | 0.21  |